AITopics | projection loss

Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision

Neural Information Processing Systems, Mar-17-2026, 11:35:32 GMT

Understanding the 3D world is a fundamental problem in computer vision. However, learning a good representation of 3D objects is still an open problem due to the high dimensionality of the data and the many factors of variation involved. In this work, we investigate the task of single-view 3D object reconstruction from a learning agent's perspective. We formulate the learning process as an interaction between 3D and 2D representations and propose an encoder-decoder network with a novel projection loss defined by the projective transformation. More importantly, the projection loss enables unsupervised learning from 2D observations without explicit 3D supervision. We demonstrate the model's ability to generate a 3D volume from a single 2D image with three sets of experiments: (1) learning from single-class objects; (2) learning from multi-class objects; and (3) testing on novel object classes. Results show superior performance and better generalization ability for 3D object reconstruction when the projection loss is involved.
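To make the idea of a projection loss concrete, here is a minimal sketch rather than the paper's implementation. It assumes an orthographic projection along the depth axis in place of the perspective transformation the abstract describes, and a mean squared error against a 2D mask. The names `project_silhouette` and `projection_loss`, and the toy `volume` and `mask` values, are hypothetical.

```python
# Illustrative sketch only: an orthographic projection along the depth axis
# stands in for the perspective transformation described in the abstract.


def project_silhouette(volume):
    """Collapse a D x H x W occupancy grid into an H x W silhouette.

    Each pixel takes the maximum occupancy found along its viewing ray,
    which here is simply the depth axis (an assumption made for brevity).
    """
    depth = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    return [
        [max(volume[d][h][w] for d in range(depth)) for w in range(width)]
        for h in range(height)
    ]


def projection_loss(volume, mask):
    """Mean squared error between the projected silhouette and a 2D mask."""
    silhouette = project_silhouette(volume)
    total = 0.0
    count = 0
    for silhouette_row, mask_row in zip(silhouette, mask):
        for predicted, observed in zip(silhouette_row, mask_row):
            total += (predicted - observed) ** 2
            count += 1
    return total / count


# Toy 2 x 2 x 2 volume of predicted occupancy probabilities (hypothetical).
volume = [
    [[0.9, 0.1], [0.0, 0.2]],
    [[0.3, 0.8], [0.0, 0.1]],
]
# Observed 2D silhouette for the same view (hypothetical).
mask = [[1, 1], [0, 0]]

loss = projection_loss(volume, mask)
```

Because the loss compares only 2D projections, it needs no ground-truth 3D volume, which is the property the abstract highlights. A trainable version would need a differentiable projection inside an autodiff framework.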

artificial intelligence, machine learning, proceedings, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)


5d2e24df9cfaad3189833b819c40b392-Paper-Conference.pdf

Neural Information Processing Systems, Feb-9-2026, 07:15:26 GMT

computer vision, dataset, recognition, (11 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.67)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


Multi-dataset Training of Transformers for Robust Action Recognition

Neural Information Processing Systems, Dec-24-2025, 07:09:22 GMT

We study the task of learning robust feature representations that generalize well across multiple datasets for action recognition. We build our method on Transformers for their efficacy. Although video action recognition has progressed greatly in the past decade, training a single model that performs well across multiple datasets remains challenging yet valuable. Here, we propose a novel multi-dataset training paradigm, MultiTrain, with two new loss terms, namely informative loss and projection loss, aiming to learn robust representations for action recognition.

multi-dataset training, robust action recognition, transformer, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.91)


Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision

Neural Information Processing Systems, Nov-21-2025, 15:32:51 GMT

Understanding the 3D world is a fundamental problem in computer vision. However, learning a good representation of 3D objects is still an open problem due to the high dimensionality of the data and the many factors of variation involved. In this work, we investigate the task of single-view 3D object reconstruction from a learning agent's perspective. We formulate the learning process as an interaction between 3D and 2D representations and propose an encoder-decoder network with a novel projection loss defined by the projective transformation. More importantly, the projection loss enables unsupervised learning from 2D observations without explicit 3D supervision. We demonstrate the model's ability to generate a 3D volume from a single 2D image with three sets of experiments: (1) learning from single-class objects; (2) learning from multi-class objects; and (3) testing on novel object classes. Results show superior performance and better generalization ability for 3D object reconstruction when the projection loss is involved.

name change, object reconstruction, perspective transformer net, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)


Learning from Interval Targets

Pukdee, Rattana, Ke, Ziqi, Gupta, Chirag

arXiv.org Artificial Intelligence, Oct-27-2025

We study the problem of regression with interval targets, where only upper and lower bounds on target values are available in the form of intervals. This problem arises when the exact target label is expensive or impossible to obtain, due to inherent uncertainties. In the absence of exact targets, traditional regression loss functions cannot be used. First, we study the methodology of using loss functions compatible with interval targets, for which we establish non-asymptotic generalization bounds based on smoothness of the hypothesis class, significantly relaxing prior assumptions of realizability and small ambiguity degree. Second, we propose a novel min-max learning formulation: minimize against the worst-case (maximized) target labels within the provided intervals. The maximization problem in the latter is non-convex, but we show that good performance can be achieved with the incorporation of smoothness constraints. Finally, we perform extensive experiments on real-world datasets and show that our methods achieve state-of-the-art performance.
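The two ideas in this abstract can be illustrated for a single prediction. The sketch below is an assumption-laden example, not the authors' code. It uses squared error as the base loss and treats "compatible with interval targets" as the distance from the prediction to its projection onto the interval. The function names are hypothetical, and the smoothness constraints mentioned in the abstract are not modeled.

```python
def project_onto_interval(prediction, lower, upper):
    """Return the point of [lower, upper] closest to the prediction."""
    return min(max(prediction, lower), upper)


def interval_projection_loss(prediction, lower, upper):
    """Squared distance from the prediction to the interval.

    The loss is zero whenever the prediction already lies inside the
    interval, so any value consistent with the bounds is not penalized.
    """
    nearest = project_onto_interval(prediction, lower, upper)
    return (prediction - nearest) ** 2


def worst_case_loss(prediction, lower, upper):
    """Largest squared error over all targets inside the interval.

    Squared error is convex in the target, so its maximum over an
    interval is attained at one of the two endpoints.
    """
    return max((prediction - lower) ** 2, (prediction - upper) ** 2)
```

For a prediction of 2.0 and the interval [1.0, 3.0], the projection loss is zero, while the worst-case loss is 1.0 because a target at either endpoint would be one unit away. Minimizing the worst case therefore pulls predictions toward the interval midpoint, whereas the projection loss only pulls them inside the interval.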

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.20925

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Data Science (0.67)


5d2e24df9cfaad3189833b819c40b392-Paper-Conference.pdf

Neural Information Processing Systems, Aug-15-2025, 03:47:18 GMT

computer vision, dataset, recognition, (11 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.67)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


Reviews: Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision

Neural Information Processing Systems, Jan-20-2025, 21:49:52 GMT

This paper attempts to reconstruct a 3D volume for an object from a single image at test time. During training it uses a number of views of the object to reconstruct a 3D volume containing the object, where the volume is broken down into smaller voxels and the network predicts whether each voxel is occupied or not. The input is an image of the object alone against a white background. They chose to ignore color and texture in their reconstruction work. The network they suggest is an encoder-decoder network where the encoder maps an image into a 3D invariant latent representation and the decoder does dense reconstruction of only that object.

object reconstruction, perspective transformer net, supervision, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (0.38)


Multi-dataset Training of Transformers for Robust Action Recognition

Neural Information Processing Systems, Oct-11-2024, 05:55:59 GMT

We study the task of learning robust feature representations that generalize well across multiple datasets for action recognition. We build our method on Transformers for their efficacy. Although video action recognition has progressed greatly in the past decade, training a single model that performs well across multiple datasets remains challenging yet valuable. Here, we propose a novel multi-dataset training paradigm, MultiTrain, with two new loss terms, namely informative loss and projection loss, aiming to learn robust representations for action recognition. We verify the effectiveness of our method on five challenging datasets: Kinetics-400, Kinetics-700, Moments-in-Time, ActivityNet, and Something-Something-v2.

dataset, robust action recognition, transformer, (5 more...)

Neural Information Processing Systems

Genre: Play > Prospect (1.00)

Technology: Information Technology > Artificial Intelligence > Vision (1.00)


Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective

Chung, Ming-Yu, Chou, Sheng-Yen, Yu, Chia-Mu, Chen, Pin-Yu, Kuo, Sy-Yen, Ho, Tsung-Yi

arXiv.org Artificial Intelligence, Nov-28-2023

Dataset distillation offers a potential means to enhance data efficiency in deep learning. Recent studies have shown its ability to counteract backdoor risks present in original training samples. In this study, we delve into the theoretical aspects of backdoor attacks and dataset distillation based on kernel methods. We introduce two new theory-driven trigger pattern generation methods specialized for dataset distillation. Following a comprehensive set of analyses and experiments, we show that our optimization-based trigger design framework informs effective backdoor attacks on dataset distillation. Notably, datasets poisoned by our designed trigger prove resilient against conventional backdoor attack detection and mitigation methods. Our empirical results validate that the triggers developed using our approaches are proficient at executing resilient backdoor attacks.

asr, backdoor attack, dataset, (15 more...)

arXiv.org Artificial Intelligence

2311.16646

Country:

Asia > Taiwan > Taiwan Province > Taipei (0.04)
Asia > China > Hong Kong > Sha Tin (0.04)
North America > United States > New York (0.04)
Asia > Nepal (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


Leveraging Skill-to-Skill Supervision for Knowledge Tracing

Kim, Hyeondey, Nam, Jinwoo, Lee, Minjae, Jegal, Yun, Song, Kyungwoo

arXiv.org Artificial Intelligence, Jun-11-2023

Knowledge tracing plays a pivotal role in intelligent tutoring systems. This task aims to predict the probability that students will answer specific questions correctly. To do so, knowledge tracing systems should trace the knowledge state of the students by utilizing their problem-solving history and knowledge about the problems. Recent advances in knowledge tracing models have enabled better exploitation of problem-solving history. However, knowledge about problems has not been studied as thoroughly as students' answering histories. Knowledge tracing algorithms that incorporate knowledge directly are important in settings with limited data or cold starts. Therefore, we consider the problem of utilizing skill-to-skill relations in knowledge tracing. In this work, we introduce expert-labeled skill-to-skill relationships. Moreover, we also provide novel methods to construct a knowledge-tracing model that leverages human experts' insight regarding relationships between skills. The results of an extensive experimental analysis show that our method outperformed a baseline Transformer model. Furthermore, we found that the extent of our model's superiority was greater in situations with limited data, which allows a smooth cold start of our model.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2306.06841

Country: Asia > South Korea > Seoul > Seoul (0.05)

Genre: Research Report (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.68)
Education > Educational Setting > Online (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

